A Concise Information-Theoretic Derivation of the Baum-Welch Algorithm
Abstract
We derive the Baum-Welch algorithm for hidden Markov models (HMMs) through an information-theoretic approach using cross-entropy, instead of the Lagrange-multiplier approach that is universal in the machine-learning literature. The proposed approach provides a more concise derivation of the Baum-Welch method and generalizes naturally to multiple observations.

Introduction

The basic hidden Markov model (HMM) [5] is defined as having a sequence of hidden or latent states Q = {qt} = {q1, q2, . . . , qT} (where t denotes the time step), in which each state is statistically independent of all but the state immediately before it, and each state emits an observation ot with a stationary (non-time-varying) probability density. Formally, the model is defined as:

p(O, Q | λ) = p(q1 | λ) [ ∏_{t=2}^{T} p(qt | qt−1, λ) ] [ ∏_{t=1}^{T} p(ot | qt, λ) ]
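The factorization above — initial-state probability, a product of transition probabilities, and a product of emission probabilities — can be evaluated directly for a given state sequence. The following is a minimal sketch using an illustrative two-state, two-symbol HMM; the parameters `pi`, `A`, and `B` are made-up for the example and are not taken from the paper:

```python
import numpy as np

# Illustrative toy HMM parameters (not from the paper).
pi = np.array([0.6, 0.4])           # pi[i]   = p(q1 = i | lambda)
A = np.array([[0.7, 0.3],           # A[i, j] = p(q_t = j | q_{t-1} = i, lambda)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],           # B[i, k] = p(o_t = k | q_t = i, lambda)
              [0.2, 0.8]])

def joint_likelihood(states, obs):
    """p(O, Q | lambda) = p(q1) * prod_t p(q_t | q_{t-1}) * prod_t p(o_t | q_t)."""
    p = pi[states[0]]
    for t in range(1, len(states)):
        p *= A[states[t - 1], states[t]]   # transition term
    for t in range(len(states)):
        p *= B[states[t], obs[t]]          # emission term
    return p

# Example: state sequence (0, 0, 1) with observations (0, 1, 1).
print(joint_likelihood([0, 0, 1], [0, 1, 1]))  # 0.6*0.7*0.3 * 0.9*0.1*0.8
```

Summing this quantity over all state sequences Q gives the observation likelihood p(O | λ); the forward-backward recursions that Baum-Welch relies on compute that sum efficiently.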
Similar Resources
Comparing the Bidirectional Baum-Welch Algorithm and the Baum-Welch Algorithm on Regular Lattice
A profile hidden Markov model (PHMM) is widely used in assigning protein sequences to protein families. In this model, the hidden states only depend on the previous hidden state and observations are independent given hidden states. In other words, in the PHMM, only the information of the left side of a hidden state is considered. However, it makes sense that considering the information of the b...
Generalized Baum-Welch and Viterbi Algorithms Based on the Direct Dependency among Observations
The parameters of a Hidden Markov Model (HMM) are transition and emission probabilities. Both can be estimated using the Baum-Welch algorithm. The process of discovering the sequence of hidden states, given the sequence of observations, is performed by the Viterbi algorithm. In both the Baum-Welch and Viterbi algorithms, it is assumed that...
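The Viterbi decoding described here is a dynamic program over the same transition and emission parameters. A minimal sketch, reusing the illustrative two-state parameters from the earlier toy example (again, not taken from any of the papers listed here):

```python
import numpy as np

# Illustrative toy HMM parameters (not from the articles above).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])   # A[i, j] = p(q_t = j | q_{t-1} = i)
B = np.array([[0.9, 0.1], [0.2, 0.8]])   # B[i, k] = p(o_t = k | q_t = i)

def viterbi(obs):
    """Most likely hidden-state sequence for obs, in log space for stability."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))              # delta[t, j]: best log-prob ending in j at t
    psi = np.zeros((T, N), dtype=int)     # psi[t, j]: argmax predecessor state
    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = delta[t - 1][:, None] + np.log(A)   # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

print(viterbi([0, 1, 1]))
```

Baum-Welch, by contrast, soft-weights all state sequences via forward-backward rather than committing to the single best path as Viterbi does.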
4.1 Overview
In this lecture, we will address problems 3 and 4. First, continuing from the previous lecture, we will view Baum-Welch re-estimation as an instance of the Expectation-Maximization (EM) algorithm and prove why the EM algorithm maximizes data likelihood. Then, we will proceed to discuss discriminative training under the maximum mutual information estimation (MMIE) framework. Specifically, we will...
متن کاملDiscriminative speaker adaptation with conditional maximum likelihood linear regression
We present a simplified derivation of the extended Baum-Welch procedure, which shows that it can be used for Maximum Mutual Information (MMI) estimation of a large class of continuous-emission-density hidden Markov models (HMMs). We use the extended Baum-Welch procedure for discriminative estimation of MLLR-type speaker adaptation transformations. The resulting adaptation procedure, termed Conditional Max...
Model-Theoretic Reformulation of the Baum-Connes and Farrell-Jones Conjectures
The Isomorphism Conjectures are translated into the language of homotopical algebra, where they resemble Thomason’s descent theorems.
Journal:
- CoRR
Volume: abs/1406.7002
Pages: -
Publication date: 2014